Takahide TERADA Haruki FUKUDA Tadahiro KURODA
A rotating shaft with attached sensors is wrapped in a two-dimensional waveguide sheet through which the data and power are wirelessly transmitted. A retrodirective transponder array affixed to the sheet beamforms power to the moving sensor to eliminate the need for a battery. A universal on-sheet reference scheme is proposed for calibrating the transponder circuit delay variation and eliminating a crystal oscillator from the sensor. A base signal transmitted from the on-sheet reference device is used for generating the pilot signal transmitted from the sensor and the power signal transmitted from the transponder. A 0.18-µm CMOS transponder chip and the sheet with couplers were fabricated. The coupler has three resonant frequencies used for the proposed system. The measured propagation gain of the electric field changes to less than ±1.5dB within a 2.0-mm distance between the coupler and the sheet. The measured power transmission efficiency with beamforming is 23 times higher than that without it. Each transponder outputs 1W or less for providing 3mW to the sensor.
Daisuke MIZOGUCHI Noriyuki MIURA Hiroki ISHIKURO Tadahiro KURODA
A wireless transceiver utilizing inductive coupling has been proposed for communication between chips in system in a package. This transceiver can achieve high-speed communication by using two-dimensional channel arrays. To increase the total bandwidth in the channel arrays, the density of the transceiver should be improved, which means that the inductor size should be scaled down. This paper discusses the scaling theory based on a constant magnetic field rule. By decreasing the chip thickness with the process scaling of 1/α, the inductor size can be scaled to 1/α and the data rate can be increased by α. As a result, the number of aggregated channels can be increased by α2 and the aggregated data bandwidth can be increased by α3. The scaling theory is verified by simulations and experiments in 350, 250, 180, and 90 nm CMOS.
Lechang LIU Keisuke ISHIKAWA Tadahiro KURODA
Parametric resonance based solutions for sub-gigahertz radio frequency transceiver with 0.3V supply voltage are proposed in this paper. As an implementation example, a 0.3V 720µW variation-tolerant injection-locked frequency multiplier is developed in 90nm CMOS. It features a parametric resonance based multi-phase synthesis scheme, thereby achieving the lowest supply voltage with -110dBc@ 600kHz phase noise and 873MHz-1.008GHz locking range in state-of-the-art frequency synthesizers.
Li-Chung HSU Masato MOTOMURA Yasuhiro TAKE Tadahiro KURODA
This paper presents work on integrating wireless 3-D interconnection interface, namely ThruChip Interface (TCI), in three-dimensional field-programmable gate array (3-D FPGA) exploration tool (TPR). TCI is an emerging 3-D IC integration solution because of its advantages over cost, flexibility, reliability, comparable performance, and energy dissipation in comparison to through-silicon-via (TSV). Since the communication bandwidth of TCI is much higher than FPGA internal logic signals, in order to fully utilize its bandwidth, the time-division multiplexing (TDM) scheme is adopted. The experimental results show 25% on average and 58% at maximum path delay reduction over 2-D FPGA when five layers are used in TCI based 3-D FPGA architecture. Although the performance of TCI based 3-D FPGA architecture is 8% below that of TSV based 3-D FPGA on average, TCI based architecture can reduce active area consumed by vertical communication channels by 42% on average in comparison to TSV based architecture and hence leads to better delay and area product.
Keita TAKATSU Hirotaka TAMURA Takuji YAMAMOTO Yoshiyasu DOI Koichi KANDA Takayuki SHIBASAKI Tadahiro KURODA
A 60-GHz injection-locked frequency divider (ILFD) is presented. A multi-order LC oscillator topology is proposed to enhance the locking range of the divider. A design guideline is described based on a theoretical analysis of the locking range enhancement. A test chip is fabricated in 65 nm CMOS. Measured locking range with 0 dBm input power is 48.5–62.9 GHz (25.9%), which is 63.6% wider compared to the previously reported ILFD. Power consumption excluding buffers and biasing circuits is 1.65 mW from 1.2 V supply. The core ILFD area is 0.0157 mm2 even with an extra pair of inductors.
Xiaolei ZHU Yanfei CHEN Sanroku TSUKAMOTO Tadahiro KURODA
The performance of successive approximation register (SAR) analog-to-digital converter (ADC) is well balanced between power and speed compare to the conventional flash or pipeline architecture. The nonlinearities suffer from the CDAC mismatch and comparator offset degrades SAR ADC performance in terms of DNL and INL. An on chip histogram-based digitally assisted background calibration technique is proposed to cancel and relax the aforesaid nonlinearities. The calibration is performed using the input signal, watching the digital codes in the specified vicinity of the decision boundaries, and feeding back to control the compensation capacitor periodically. The calibration does not require special calibration signal or additional analog hardware which is simple and amenable to hardware or software implementations. A 9-bit SAR ADC with split CDAC has been implemented in a 65 nm CMOS technology and it achieves a peak SNDR of 50.81 dB and consumes 1.34 mW from a 1.2-V supply. +0.4/-0.4 LSB DNL and +0.5/-0.7 LSB INL are achieved after calibration. The ADC has input capacitance of 180 fF and occupies an area of 0.10.13 mm2.
Shusuke YANAGAWA Ryota SHIMIZU Mototsugu HAMADA Toru SHIMIZU Tadahiro KURODA
This paper describes a top-down design methodology to optimize resonant capacitance in a wireless power transfer system with 3-D stacked two receivers. A 1:2 selective wireless power transfer is realized by a frequency/time division multiplexing scheme. The power transfer function is analytically formulated and the optimum tuning capacitance is derived, which is validated by comparing with system simulation results. By using the optimized values, power transfer efficiencies at 6.78MHz and 13.56MHz are simulated to be 80% and 84%, respectively, which are <3% worse than a conventional wireless power transfer system.
Takayuki SHIBASAKI Hirotaka TAMURA Kouichi KANDA Hisakatsu YAMAGUCHI Junji OGAWA Tadahiro KURODA
This paper describes an 18-GHz coupled VCO array for low jitter and low phase deviation clock distribution. To reduce the skew, jitter and power consumption associated with clock distribution, the clock is generated by a one-dimensional VCO array in which the oscillating nodes of adjacent VCOs are directly connected with wires. The effects of the wire length and number of unit VCOs in the array are discussed. Both 4-unit and a 2-unit VCO arrays for delivering a clock signal to a 16:1 multiplexor were designed and fabricated in a 90-nm CMOS process. The frequency range of the 4-unit VCO array was 16 GHz to 18.5 GHz while each unit VCO consumed 2 mA.
Tadahiro KURODA Tetsuya FUJITA Fumitoshi HATORI Takayasu SAKURAI
This paper describes a Variable Threshold-voltage CMOS technology (VTCMOS) which controls the threshold voltage (VTH) by means of substrate bias control. Circuit techniques to combine a switch circuit for an active mode and a pump circuit for a standby mode are presented. Design considerations, such as latch-up immunity and upper limit of reverse substrate bias, are discussed. Experimental results obtained from chips fabricated in a 0.3 µm VTCMOS technology are reported. VTH controllability including temperature dependence and influence on short channel effect, power penalty caused by the control circuit, substrate current dependence at low VTH, and substrate noise influence on circuit performance are investigated. A scaling theory is also presented for use in the discussion of future possibilities and problems involved in this technology.
Ryota SEKIMOTO Akira SHIKATA Kentaro YOSHIOKA Tadahiro KURODA Hiroki ISHIKURO
An ultra low power and low voltage successive-approximation-register (SAR) analog-to-digital converter (ADC) with timing optimized asynchronous clock generator is presented. By calibrating the delay amount of the clock generator, the DAC settling waiting time is adaptively optimized to counter the device mismatch. This technique improved the maximum sampling frequency by 40% keeping ENOB around 7-bit at 0.4 V analog and 0.7 V digital power supply voltage. The delay time dependency on power supply has small effect to the accuracy of conversion. Decreasing of supply voltage by 9% degrades ENOB only by 0.1-bit, and the proposed calibration can give delay margins for high voltage swing. The prototype ADC fabricated in 40 nm CMOS process achieved figure of merit (FoM) of 8.75-fJ/conversion-step with 2.048 MS/s at 0.6 V analog and 0.7 V digital power supply voltage. The ADC can operates from 50 S/s to 8 MS/s keeping ENOB over 7.5-bit.
Tadahiro KURODA Toshiyuki FUKUNAGA Kenji MATSUO Kazuhiko KASAI Ayako HIRATA Shinji FUJII Masahiro KIMURA Hiroaki SUZUKI
This paper describes a new biasing scheme for sensing circuits, namely an automated bias control (ABC) circuit, for high-performance VLSI's. The ABC circuit can automatically gear the output level of sensing circuits to the input threshold voltage of the succeeding CMOS converters. The sensing performance can be accelerated with the ABC circuit either by reducing excessive signal level margin between the sensing circuits and the CMOS converters or by reducing extra stage of signal amplification. Since feedback control of the ABC circuit ensures a correct dc biasing even under large process deviation and circuit condition changes, wider operation margin can also be obtained. Three successful applications of the ABC circuit are reported: a sense amplifier, an address transition detector (ATD), and an ECL-CMOS input buffer. A 64-kb BiCMOS SRAM employing the proposed sense amplifier and the ATD has been fabricated with a 0.8-µm 9-GHz BiCMOS technology. The SRAM has an address access time of 4.5 ns.
Kota SHIBA Atsutake KOSUGE Mototsugu HAMADA Tadahiro KURODA
This paper describes an in-depth analysis of crosstalk in a high-bandwidth 3D-stacked memory using a multi-hop inductive coupling interface and proposes two countermeasures. This work analyzes the crosstalk among seven stacked chips using a 3D electromagnetic (EM) simulator. The detailed analysis reveals two main crosstalk sources: concentric coils and adjacent coils. To suppress these crosstalks, this paper proposes two corresponding countermeasures: shorted coils and 8-shaped coils. The combination of these coils improves area efficiency by a factor of 4 in simulation. The proposed methods enable an area-efficient inductive coupling interface for high-bandwidth stacked memory.
Yuxiang YUAN Yoichi YOSHIDA Tadahiro KURODA
A wireless power link utilizing inductive coupling is developed between stacked chips. In this paper, we discuss inductor layout optimization and rectifier circuit design. The inductive-coupling power link is analyzed using simple equivalent circuit models. On the basis of the analytic models, the inductor size is minimized for the given required power on the receiver chip. Two kinds of full-wave rectifiers are discussed and compared. Various low-power circuit design techniques for rectifiers are employed to decrease the substrate leakage current, reduce the possibility of latch-up, and improve the power transmission efficiency and the high-frequency performance of the rectifier block. Test chips are fabricated in a 0.18 µm CMOS process. With a pair of 700700 µm2 on-chip inductors, the test chips achieve 10% peak efficiency and 36 mW power transmission. Compared with the previous work the received power is 13 times larger for the same inductor size .
Tadahiro KURODA Takayasu SAKURAI
This paper surveys low-power circuit techniques for CMOS ULSIs. For many years a power supply voltage of 5 V was employed. During this period power dissipation of CMOS ICs as a whole increased four-fold every three years. It is predicted that by the year 2000 the power dissipation of high-end ICs will exceed the practical limits of ceramic packages, even if the supply voltage can be feasibly reduced. CMOS ULSIs now face a power dissipation crisis. A new philosophy of circuit design is required. The power dissipation can be minimized by reducing: 1) supply voltage, 2) load capacitance, or 3) switching activity. Reducing the supply voltage brings a quadratic improvement in power dissipation. This simple solution, however, comes at a cost in processing speed. We investigate the proposed methods of compensating for the increased delay at low voltage. Reducing the load capacitance is the principal area of interest because it contributes to the improvement of both power dissipation and circuit speed. Pass-transistor logic is attracting attention as it requires fewer transistors and exhibits less stray capacitance than conventional CMOS static cicuits. Variations in its circuit topology as well as a logic synthesis method are presented and studied. A great deal of research effort has been directed towards studying every portion of LSI circuits. The research achievements are categorized in this paper by parameters associated with the source of CMOS power dissipation and power use in a chip.
Li-Chung HSU Junichiro KADOMOTO So HASEGAWA Atsutake KOSUGE Yasuhiro TAKE Tadahiro KURODA
ThruChip interface (TCI) is an emerging wireless interface in three-dimensional (3-D) integrated circuit (IC) technology. However, the TCI physical design guidelines remain unclear. In this paper, a ThruChip test chip is designed and fabricated for design guidelines exploration. Three inductive coupling interface physical design scenarios, baseline, power mesh, and dummy metal fill, are deployed in the test chip. In the baseline scenario, the test chip measurement results show that thinning chip or enlarging coil dimension can further reduce TCI power. The power mesh scenario shows that the eddy current on power mesh can dramatically reduce magnetic pulse signal and thus possibly cause TCI to fail. A power mesh splitting method is proposed to effectively suppress eddy current impact while minimizing power mesh structure impact. The simulation results show that the proposed method can recover 77% coupling coefficient loss while only introducing additional 0.5% IR-drop. In dummy metal fill case, dummy metal fill enclosed within TCI coils have no impact on TCI transmission and thus are ignorable.
Mototsugu HAMADA Tadahiro KURODA
This paper describes transmission line couplers for non-contact connecters. Their characteristics are formulated in closed forms and design methodologies are presented. As their applications, three different types of transmission line couplers, two-fold transmission line coupler, single-ended to differential conversion transmission line coupler, and rotatable transmission line coupler are reviewed.
Dongzhu LI Zhijie ZHAN Rei SUMIKAWA Mototsugu HAMADA Atsutake KOSUGE Tadahiro KURODA
A 0.13mJ/prediction with 68.6% accuracy wired-logic deep neural network (DNN) processor is developed in a single 16-nm field-programmable gate array (FPGA) chip. Compared with conventional von-Neumann architecture DNN processors, the energy efficiency is greatly improved by eliminating DRAM/BRAM access. A technical challenge for conventional wired-logic processors is the large amount of hardware resources required for implementing large-scale neural networks. To implement a large-scale convolutional neural network (CNN) into a single FPGA chip, two technologies are introduced: (1) a sparse neural network known as a non-linear neural network (NNN), and (2) a newly developed raster-scan wired-logic architecture. Furthermore, a novel high-level synthesis (HLS) technique for wired-logic processor is proposed. The proposed HLS technique enables the automatic generation of two key components: (1) Verilog-hardware description language (HDL) code for a raster-scan-based wired-logic processor and (2) test bench code for conducting equivalence checking. The automated process significantly mitigates the time and effort required for implementation and debugging. Compared with the state-of-the-art FPGA-based processor, 238 times better energy efficiency is achieved with only a slight decrease in accuracy on the CIFAR-100 task. In addition, 7 times better energy efficiency is achieved compared with the state-of-the-art network-optimized application-specific integrated circuit (ASIC).
Tsutomu TAKEYA Tadahiro KURODA
In this paper, a symbol-rate clock recovery scheme for a receiver that uses an integrating decision feedback equalizer (DFE) is proposed. The proposed clock recovery using expected received signal amplitudes as the criterion realizes minimum mean square error (MMSE) clock recovery. A receiver architecture using an integrating DFE with the proposed symbol-rate clock recovery is also proposed. The proposed clock recovery algorithm successfully recovered the clock phase in a system level simulation only with a DFE. Higher jitter tolerance than 0.26 UIPP at 10 Gb/s operation was also confirmed in the simulation with an 11 dB channel loss at 5 GHz.
Tsutomu TAKEYA Tadahiro KURODA
This paper presents a method of designing transmission line couplers (TLCs) and a mixer-based receiver for dicode partial response communications. The channel design method results in the optimum TLC design. The receiver with mixers and DC balancing circuits reduces the threshold control circuits and digital circuits to decode dicode partial response signals. Our techniques enable low inter-symbol interference (ISI) dicode partial response communications without three level decision circuits and complex threshold control circuits. The techniques were evaluated in a simulation with an EM solver and a transistor level simulation. The circuit was designed in the 90-nm CMOS process. The simulation results show 12-Gb/s operation and 52mW power consumption at 1.2V.